On Discovering Co-Location Patterns in Datasets: A Case Study of Pollutants and Child Cancers

Authors

  • Jundong Li
  • Aibek Adilmagambetov
  • Osmar R. Zaïane
  • Alvaro Osornio-Vargas
  • Osnat Wine
Abstract

We aim to identify relationships between cancer cases and pollutant emissions by proposing a novel co-location mining algorithm. Specifically, we ask whether the location of a child diagnosed with cancer is related to any combination of chemicals emitted by facilities in that location. Co-location pattern mining detects sets of spatial features that are frequently located in close proximity to each other. Most previous work in this domain relies on transaction-free Apriori-like algorithms that depend on user-defined thresholds and are designed for Boolean data points. Because there is no clear notion of transactions, it is nontrivial to apply association rule mining techniques to the co-location mining problem. Our approach is based on a grid-based transactionization of the geographic space and is designed to mine datasets with extended spatial objects. It can also incorporate uncertainty about the existence of features, to model real-world scenarios more accurately. We eliminate the need for a global threshold by introducing a statistical test to validate the significance of candidate co-location patterns and rules. Experiments on both synthetic and real datasets show that our algorithm can detect a considerable number of statistically significant co-location patterns.

⋆ The work was done while the author was at the University of Alberta.

  • Jundong Li: Computer Science Engineering, Arizona State University, Tempe, Arizona, USA
  • Aibek Adilmagambetov, Mohomed Shazan Mohomed Jabbar, Osmar R. Zaïane: Department of Computing Science, University of Alberta, Edmonton, Alberta, Canada. E-mail: {adilmaga, mohomedj, zaiane}@ualberta.ca
  • Alvaro Osornio-Vargas, Osnat Wine: Department of Pediatrics, University of Alberta, Edmonton, Alberta, Canada. E-mail: {osornio, osnat}@ualberta.ca
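To make the idea of grid-based transactionization mentioned in the abstract more concrete, the following is a minimal sketch under simplifying assumptions, not the authors' algorithm. It assumes every spatial object is a point buffered by a fixed-radius disc, treats each grid point as one transaction holding the feature types whose discs cover it, and computes a plain support ratio. It does not model extended objects of other shapes, existence uncertainty, or the statistical significance test described in the abstract. The function names, the feature labels, and the parameters `radius`, `step`, `width`, and `height` are all hypothetical.

```python
from math import hypot


def grid_transactions(objects, radius, step, width, height):
    """Turn a set of spatial objects into transactions via a regular grid.

    objects: iterable of (feature, x, y) tuples (hypothetical input format).
    Each grid point yields one transaction: the set of feature types whose
    buffered object (a disc of the given radius) covers that grid point.
    Grid points covered by no object are skipped.
    """
    transactions = []
    nx = int(width // step) + 1
    ny = int(height // step) + 1
    for i in range(nx):
        for j in range(ny):
            gx, gy = i * step, j * step
            items = {
                feature
                for feature, x, y in objects
                if hypot(x - gx, y - gy) <= radius
            }
            if items:
                transactions.append(frozenset(items))
    return transactions


def support(transactions, pattern):
    """Fraction of transactions that contain every feature in the pattern."""
    pattern = frozenset(pattern)
    if not transactions:
        return 0.0
    return sum(pattern <= t for t in transactions) / len(transactions)
```

With transactions in hand, standard association rule mining tools can in principle be applied. The abstract indicates that the paper replaces a fixed support threshold with a statistical test, which this sketch leaves out.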


Similar resources

Modeling of Air Pollutants’ Dispersion by Means of CALMET/CALPUFF (Case Study: District 7 in Tehran city).

The current study aims at modelling the dispersion of two pollutants, namely CO (carbon monoxide) and SO2 (sulfur dioxide), released from 20 main line sources in District 7 of Tehran Municipality, by means of the CALPUFF modeling system. CALPUFF is a non-steady state puff modeling software which employs meteorological, terrain, and land-use data to effectively simulate air pollutants' dispersion ...

Mining Of Spatial Co-location Pattern from Spatial Datasets

Spatial data mining, or knowledge discovery in spatial databases, refers to the extraction of implicit knowledge, spatial relations, or other patterns not explicitly stored in spatial databases. Spatial data mining is the process of discovering interesting characteristics and patterns that may implicitly exist in spatial databases. A huge amount of spatial data and newly emerging concept of Spati...


Discovering Regional Co-location Patterns for Sets of Continuous Variables in Spatial Datasets

This paper proposes a novel framework for mining regional co-location patterns with respect to sets of continuous variables in spatial datasets. The goal is to identify regions in which multiple continuous variables with values from the wings of their statistical distribution are co-located. A co-location mining framework is introduced that operates in the continuous domain without and which vi...


Understanding Temporal Human Mobility Patterns in a City by Mobile Cellular Data Mining, Case Study: Tehran City

Recent studies have shown that urban complex behaviors like human mobility should be examined by newer and smarter methods. The ubiquitous use of mobile phones and other smart communication devices helps us use a bigger amount of data that can be browsed by the hours of the day, the days of the week, geographic area, meteorological conditions, and so on. In this article, mobile cellular data mi...



Journal title:
  • GeoInformatica

Volume 20

Publication date: 2016